16 research outputs found

    MEGA: Multilingual Evaluation of Generative AI

    Generative AI models have shown impressive performance on many Natural Language Processing tasks such as language understanding, reasoning, and language generation. An important question being asked by the AI community today is about the capabilities and limits of these models, and it is clear that evaluating generative AI is very challenging. Most studies on generative LLMs have been restricted to English and it is unclear how capable these models are at understanding and generating text in other languages. We present the first comprehensive benchmarking of generative LLMs - MEGA, which evaluates models on standard NLP benchmarks, covering 16 NLP datasets across 70 typologically diverse languages. We compare the performance of generative LLMs including ChatGPT and GPT-4 to state-of-the-art (SOTA) non-autoregressive models on these tasks to determine how well generative models perform compared to the previous generation of LLMs. We present a thorough analysis of the performance of models across languages and tasks and discuss challenges in improving the performance of generative LLMs on low-resource languages. We create a framework for evaluating generative LLMs in the multilingual setting and provide directions for future progress in the field. Comment: EMNLP 202

    End-to-end Privacy Preserving Training and Inference for Air Pollution Forecasting with Data from Rival Fleets

    Privacy-preserving machine learning (PPML) promises to train machine learning (ML) models by combining data spread across multiple data silos. Theoretically, secure multiparty computation (MPC) allows multiple data owners to train models on their joint data without revealing the data to each other. However, prior implementations of this secure training using MPC have three limitations: they have only been evaluated on CNNs, and LSTMs have been ignored; fixed point approximations have affected training accuracies compared to training in floating point; and due to significant latency overheads of secure training via MPC, its relevance for practical tasks with streaming data remains unclear. The motivation of this work is to report our experience of addressing the practical problem of secure training and inference of models for urban sensing problems, e.g., traffic congestion estimation, or air pollution monitoring in large cities, where data can be contributed by rival fleet companies while balancing the privacy-accuracy trade-offs using MPC-based techniques. Our first contribution is to design a custom ML model for this task that can be efficiently trained with MPC within a desirable latency. In particular, we design a GCN-LSTM and securely train it on time-series sensor data for accurate forecasting, within 7 minutes per epoch. As our second contribution, we build an end-to-end system of private training and inference that provably matches the training accuracy of cleartext ML training. This work is the first to securely train a model with LSTM cells. Third, this trained model is kept secret-shared between the fleet companies and allows clients to make sensitive queries to this model while carefully handling potentially invalid queries. Our custom protocols allow clients to query predictions from privately trained models in milliseconds, all the while maintaining accuracy and cryptographic security.

    ‘Beach’ to ‘Bitch’: Inadvertent Unsafe Transcription of Kids’ Content on YouTube

    Over the last few years, YouTube Kids has emerged as one of the highly competitive alternatives to television for children's entertainment. Consequently, YouTube Kids' content should receive an additional level of scrutiny to ensure children's safety. While research on detecting offensive or inappropriate content for kids is gaining momentum, little or no current work exists that investigates to what extent AI applications can (accidentally) introduce content that is inappropriate for kids. In this paper, we present a novel (and troubling) finding that well-known automatic speech recognition (ASR) systems may produce text content highly inappropriate for kids while transcribing YouTube Kids' videos. We dub this phenomenon inappropriate content hallucination. Our analyses suggest that such hallucinations are far from occasional, and the ASR systems often produce them with high confidence. We release a first-of-its-kind data set of audios for which the existing state-of-the-art ASR systems hallucinate inappropriate content for kids. In addition, we demonstrate that some of these errors can be fixed using language models.

    Analysis of thermal comfort properties of tri-layer knitted fabrics

    This study develops tri-layer knitted fabrics for active sportswear using microdenier filament polyester yarn, spun polyester yarn, polypropylene, and cotton, and examines their thermal comfort properties. The Microdenier Polyester/Microdenier Polyester/Cotton combination showed exceptionally good thermal comfort properties, owing to structural factors such as its filamentous nature, lower thickness, low areal density, and lower bulkiness. The choice of fiber also plays a crucial part in the thermal comfort properties of the tri-layer fabrics. The Microdenier Polyester/Polypropylene/Cotton and Polypropylene/Microdenier Polyester/Cotton samples performed next best after the Microdenier Polyester/Microdenier Polyester/Cotton combination, because polypropylene also has good wicking characteristics. The Microdenier Polyester/Cotton/Polypropylene sample showed poor thermal behavior because of factors such as protruding cotton fibers, increased thickness, and high areal density, among others. Comparing filament and spun yarn, the filament yarn is highly recommended because of its favorable behavior.

    Tidal dynamics and rainfall control N2O and CH4 emissions from a pristine mangrove creek

    Dissolved CH4, N2O, O2, and inorganic nitrogen nutrients (NH4+, NO3− and NO2−) were measured over tidal cycles in pristine Wright Myo mangrove creek waters during dry and wet seasons. Dissolved CH4 and N2O showed no seasonality (dry season: 491 ± 133 nmol CH4 l^-1, 9.0 ± 2.3 nmol N2O l^-1; wet season: 466 ± 94 nmol CH4 l^-1, 8.6 ± 1.3 nmol N2O l^-1). Creek water dissolved gas and inorganic nitrogen distributions reflect sediment porewater release during hydrostatic pressure drop toward low water. Creek water CH4 emission was suppressed by oxidation during rainfall, consistent with changes to dissolved nitrogen speciation, although N2O emissions were unaffected. Scaling up emissions flux estimates from mangrove creek waters and intertidal sediment gives worldwide mangrove emissions of ~1.3 × 10^11 mol CH4 yr^-1 and 2.7 × 10^9 mol N2O yr^-1; mangrove ecosystems are thus small contributors to coastal N2O emissions but could dominate coastal CH4 emissions. Comparing our data with mangrove CO2 fluxes, mangrove ecosystems could be small net contributors of atmospheric greenhouse gases.

    Serum Ferritin;

    The aim of the study is to investigate the levels of hormones, lipids, iron, and vitamins in the serum of women with primary infertility. There are many biological causes of infertility, including some that medical intervention can treat. The blood samples collected were analysed for hormones (LH, FSH, prolactin, estradiol), lipids (cholesterol, triglycerides, HDL, VLDL, LDL), iron, haemoglobin, serum ferritin, and vitamins (D, E, C). The results were analysed with GraphPad Prism software. Compared with the control group, the test group showed increased levels of LH, FSH, prolactin, and estradiol; increased cholesterol, LDL, and VLDL; decreased triglycerides and HDL; decreased iron and haemoglobin; increased serum ferritin; decreased vitamins D and C; and increased vitamin E.